For any quantum algorithm operating on pure states we prove that the presence of
multi-partite entanglement, with a number of parties that increases unboundedly with
input size, is necessary if the quantum algorithm is to offer an exponential speed-up
over classical computation. Furthermore we prove that the algorithm can be classically
efficiently simulated to within a prescribed tolerance $\eta$ even if a suitably small amount of
global entanglement (depending on $\eta$) is present. We explicitly identify the occurrence of
increasing multi-partite entanglement in Shor's algorithm. Our results do not apply to
quantum algorithms operating on mixed states in general and we discuss the suggestion
that an exponential computational speed-up might be possible with mixed states in the
total absence of entanglement. Finally, despite the essential role of entanglement for
pure state algorithms, we argue that it is nevertheless misleading to view entanglement
as a key resource for quantum computational power.

1 Introduction

Quantum computation is generally regarded as being more powerful than classical computation.
The evidence for this viewpoint begins with Feynman's pioneering observation that
the simulation of a general quantum evolution on a classical computer appears to require an
exponential overhead in computational resources compared to the physical resources needed
for a direct physical implementation of the quantum process itself. Subsequent work by
Deutsch, Bernstein and Vazirani, Simon, Grover [5], Shor and others showed
how quantum evolution can be harnessed to carry out some useful computational tasks
more rapidly than by any known classical means. Perhaps the most dramatic such result
is Shor's quantum algorithm for integer factorisation which succeeds in factoring an integer
of n digits in a running time that grows less rapidly than $O(n^3)$ whereas the best known
classical algorithm is exponentially slower (with running time $O(\exp(n^{1/3} (\log n)^{2/3}))$). Thus for
some computational tasks (such as factoring) quantum physics appears to provide an exponential
benefit but for other tasks (such as satisfiability or other NP complete problems)
the quantum benefits appear to be inherently more restricted (giving perhaps at most a
polynomial speedup).

The concept of computational power provides a fundamentally new language and set of 
tools for studying the relationship between classical and quantum physics. Indeed it is of 
great interest to try to characterise, from this point of view, the essential non-classical
ingredients that give rise to the enhanced quantum computational power. Also an understanding
of the limitations of quantum computational power (such as the apparent lack of an efficient
solution of an NP complete problem) may provide insights into the strange architecture of
the quantum formalism, perhaps even leading to physical principles, formulated in terms of
the concept of computational complexity, that may guide the development of new physical
theories by severely restricting acceptable forms of proposed new formalisms. Indeed
arbitrarily created "toy" physical theories (including most proposed non-linear generalisations
of quantum theory) tend to engender immense computing power. The apparently limited
nature of quantum computational power makes quantum theory rather atypical so this
observation is probably significant.

One fundamental non-classical feature of the quantum formalism is the rule for how
the state space $S_{AB}$ of a composite system AB is constructed from the state spaces $S_A$
and $S_B$ of the parts A and B [11]. In classical theory, $S_{AB}$ is the cartesian product of
$S_A$ and $S_B$ whereas in quantum theory it is the tensor product (and the state spaces are
linear spaces). This essential distinction between cartesian and tensor products is precisely
the phenomenon of quantum entanglement viz. the existence of pure states of a composite
system that are not product states of the parts.

In quantum theory, state spaces always admit the superposition principle whereas in 
classical theory, state spaces generally do not have a linear structure. But there do exist 
classical state spaces with a natural linear structure (e.g. the space of states of an elastic 
vibrating string with fixed endpoints) so the possibility of superposition in itself, is not a 
uniquely quantum feature. 



In [11, 13] it was argued that there is a relationship between entanglement and the
apparent ability of a quantum process to perform a computational task with exponentially
reduced resources (compared to any classical process). We briefly recall the two main
points made in [11], [13]. The first point concerns the physical resources needed to represent
superpositions in quantum versus classical theory. To represent a superposition of $2^n$ levels
classically, the levels must all correspond to a physical property of a single system (with no
subsystems) as classical states of separate systems can never be superposed. Hence we will
need an exponentially high tower of levels and the amount of the physical resource needed
will grow exponentially with n. In contrast, in quantum theory because of entanglement,
a general superposition of $2^n$ levels may be represented in n 2-level systems. Thus the
amount of the physical resource (that defines the levels) will grow only linearly with n (i.e. 
the number of 2-level systems). In the classical case one may attempt to circumvent the 
need for an exponential tower by considering a linear system with infinitely many levels 
that accumulate below a finite upper bound. In this way we could represent exponentially 
growing superpositions with only a constant cost in the physical resource. But now the levels 
must crowd exponentially closely together and we will need to build our instruments with 
exponentially finer precision. This again will presumably require exponentially increasing 
physical resources. 

This first point may be expressed in more computational terms as follows. Any positive
integer N may be represented in unary or binary. The unary representation (being a string
of 1's of length N) is exponentially longer than the binary representation (having length
$\log N$). We may equivalently take the unary representation as a string of $(N - 1)$ 0's
followed by a single 1, which we view as a single mark at height N. Now consider physical
implementations of these representations of numbers. The unary representation corresponds
to the N-th level of a single system and hence we can superpose unary representations
of numbers in either classical or quantum physics. For binary numbers we can exploit
the compactness of the representation by using $\log N$ 2-level systems. In that case we
can superpose these representations in quantum theory but not in classical theory. In
summary, physical representations of binary numbers exist in both classical and quantum
systems but only in the quantum case can these representations be superposed. This is
precisely the phenomenon of quantum entanglement. If we wish to perform computations
on superpositions of numbers in a classical setting then this is possible but we must use
the exponentially more costly unary representation i.e. the quantum formalism offers a far
greater potential power for computations in superposition.



The second point made in [11] concerns the classical computational cost of mimicking
a typical step in a quantum computation, which we epitomise as follows. Suppose that at
some stage of a quantum computation the state is an n-qubit (generally entangled) state
$|\alpha\rangle = \sum a_{i_1 \ldots i_n} |i_1\rangle \cdots |i_n\rangle$ and suppose that the next step is the application of a 1-qubit
gate U to the first qubit. The updated state is then $|\alpha'\rangle = \sum a'_{i_1 i_2 \ldots i_n} |i_1\rangle \cdots |i_n\rangle$ where

$$ a'_{i_1 i_2 \ldots i_n} = \sum_{j} U_{i_1 j} \, a_{j i_2 \ldots i_n} \qquad (1) $$

and $U_{ij}$ are the matrix elements of U. This is only one step of quantum computation, on
only n qubits, but to classically compute the updated state description via eq. (1) we would
require exponentially many arithmetic operations and also the classical description of the
state itself is exponentially large (compared to the compact n qubit quantum physical
representation). Now the point is that these exponential gaps between classical and quantum
resources can be connected to the concept of entanglement as follows. If the state $|\alpha\rangle$ were
unentangled i.e. if $a_{i_1 \ldots i_n}$ is given as a product $a_{i_1} b_{i_2} \cdots d_{i_n}$ then for the classical
representation, both the state description and the computation of its update become polynomially
sized, with $O(n)$ resources for the state description and a constant amount of computation
for the update, and these are now equivalent to the corresponding quantum resources. This
suggests that if entanglement is absent in a quantum algorithm then the algorithm can be
classically simulated with an equivalent amount of classical resources. Stated otherwise,
if a quantum algorithm (on pure states) is to offer an exponential speedup over classical
algorithms, then entanglement must appear in the states used by the quantum algorithm.
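
The contrast between the two descriptions can be illustrated with a minimal Python sketch. It assumes exact rational amplitudes (via `fractions.Fraction`) and stores a general state as a dictionary from bit tuples to amplitudes; the function names and the example data are illustrative choices and not part of the original argument. `full_update` carries out the update of eq. (1) on all $2^n$ amplitudes, whereas `product_update` touches only the first factor of a product state.

```python
from fractions import Fraction as F
from itertools import product

def full_update(amps, U, n):
    """Eq. (1): a'_{i1 i2...in} = sum_j U[i1][j] * a_{j i2...in} (2**n entries)."""
    new = {}
    for bits in product((0, 1), repeat=n):
        rest = bits[1:]
        new[bits] = sum(U[bits[0]][j] * amps[(j,) + rest] for j in (0, 1))
    return new

def product_update(factors, U):
    """Same gate applied to a product state stored as n single-qubit vectors."""
    a = factors[0]
    new_first = [U[0][0] * a[0] + U[0][1] * a[1],
                 U[1][0] * a[0] + U[1][1] * a[1]]
    return [new_first] + factors[1:]

def expand(factors):
    """Expand n single-qubit vectors into the full table of 2**n amplitudes."""
    amps = {}
    for bits in product((0, 1), repeat=len(factors)):
        value = 1
        for qubit, bit in zip(factors, bits):
            value *= qubit[bit]
        amps[bits] = value
    return amps

# Hypothetical example data: a rational 1-qubit gate and a 3-qubit product state.
U = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]
factors = [[F(1), F(0)], [F(3, 5), F(4, 5)], [F(0), F(1)]]

via_full = full_update(expand(factors), U, 3)        # exponential-size route
via_product = expand(product_update(factors, U))     # constant-work update
```

Both routes give the same amplitudes; the product route stores 2n numbers and changes only two of them per 1-qubit gate.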

Our discussion above has been qualitative and we have glossed over various significant
issues. Firstly, the terms in eq. (1) are complex numbers and from a computational point
of view they have potentially infinite sized descriptions. Hence they will need to be restricted
to, or replaced by, suitably finitely describable numbers (such as rationals) if we
are to avoid a potentially prohibitively large cost for the classical computation of individual
arithmetic operations. The second issue concerns how much entanglement and what type
of entanglement is required if a quantum algorithm is to offer exponential speed up over
classical algorithms. This is a main concern of the present paper. Let p be any fixed positive
integer. A state of n qubits will be called p-blocked if no subset of p + 1 qubits are
entangled together. Thus a p-blocked state may involve a large amount of entanglement
but this entanglement is restricted, in its multi-partiteness, to sets of at most p qubits. Now
suppose that a quantum algorithm has the property that the state at each step is pure
and p-blocked. We note that the qubits in each block can vary from step to step in the
algorithm; indeed each qubit can be entangled with every other qubit at some stage of the
computation. Then we will show that again the algorithm can be classically simulated with
an amount of classical resources that is equivalent to the quantum resources of the original
algorithm (i.e. the classical resources needed are polynomially, but not exponentially, larger
than the quantum resources).

We note that, in contrast, in communication tasks, entanglement restricted to merely
bi-partite entanglement suffices for exponential benefits (for example an exponential reduction
of communication complexity) but for "standard" computational tasks our results
show that the availability of (even increasing amounts of) only bi-partite entanglement cannot
give an exponential speed-up (despite the fact that 2-qubit gates suffice for universal
computation). Also our results show that a distributed quantum computer, which has any
number of quantum processors but each with bounded size and only classical communication
between them, cannot offer an exponential speed up over classical computation - if the
local processors have size up to p qubits each then the state will be p-blocked at every stage.

In section 3 we will prove our main result, that for any quantum algorithm operating
on pure states, the presence of multi-partite entanglement, with a number of parties that
increases unboundedly with input size, is necessary if the quantum algorithm is to offer an
exponential speed-up over classical computation (theorem 1). Furthermore we will prove
that the algorithm can be classically efficiently simulated to within a prescribed tolerance
$\eta$ even if a suitably small amount of entanglement (depending on $\eta$) is present (theorem
2). The presence or absence of increasingly widespread multi-partite entanglement is not
an obvious feature of a given family of quantum states and in a later section we will explicitly
identify the presence of this resource in Shor's algorithm.

The theorems of section 3 apply to quantum algorithms that operate on pure states which
are required to be p-blocked (for some fixed p) at every stage. As such, the theorems actually
also apply to quantum algorithms on mixed states, where the state is again required to be
p-blocked at each stage (i.e. any such algorithm can be classically efficiently simulated). Now,
for pure states, the p-blocked condition corresponds exactly to the absence of entanglement
(of more than p qubits) but this is no longer true for mixed states: a mixed state $\rho$ is
p-blocked if it is a product of mixtures (cf. definition 4) whereas $\rho$ is unentangled if it is
separable i.e. a mixture of products (of p-qubit states) and such $\rho$'s are not necessarily
p-blocked. Thus for mixed states, the p-blocked condition is considerably stronger than
the condition of absence of entanglement. This leads to the following question: suppose
that a quantum algorithm operates on general mixed states and at every stage the state is
unentangled (in the sense of being separable). Can the algorithm be classically simulated
with an amount of classical resources that is (polynomially) equivalent to the quantum
resources of the original algorithm? This fundamental question remains unresolved. In
a later section we will discuss some essential differences between pure and mixed unentangled
states which suggest a negative answer (in contrast to the case of pure states) i.e. that an
exponential computational speed-up might plausibly be achievable in quantum algorithms
on mixed quantum states that have no entanglement.

According to our main results above we can say that entanglement is necessary in a
quantum algorithm (on pure states) if the algorithm is to offer an exponential speed-up
over classical computation. Does this mean that entanglement can be identified as an
essential resource for quantum computational power? In a later section we will suggest (perhaps
somewhat surprisingly) that this is not a good conclusion to draw from our results! Indeed
our theorems are based entirely on the idea of mathematically representing quantum states
as amplitudes and then classically computing the quantum evolution with respect to this
mathematical description. From our computational point of view entanglement emerges
as a significant property because absence of entanglement implies a polynomially sized
description of the quantum process, when the process is mathematically expressed in the
amplitude description. But suppose that instead of the amplitude description we choose
some other mathematical description $\mathcal{D}$ of quantum states (and gates). Indeed there is a
rich variety of possible alternative descriptions. Then there will be an associated property
prop($\mathcal{D}$) of states which guarantees that the $\mathcal{D}$-description of the quantum computational
process grows only polynomially with the number of qubits if prop($\mathcal{D}$) is absent (e.g. if $\mathcal{D}$ is
the amplitude description then prop($\mathcal{D}$) is just the notion of entanglement). Thus, just as for
entanglement, we can equally well claim that prop($\mathcal{D}$), for any $\mathcal{D}$, is an essential resource for
quantum computational speed-up! Entanglement itself appears to have no special status
here. In summary, the point is that we can have different mathematical formalisms for
expressing quantum theory, and although they are fully mathematically equivalent, they
will lead to quite different families of states (of increasingly many qubits) that have a
polynomially sized description with respect to the chosen formalism. Hence we also get
different notions of a physical quality that guarantees a state will not be of the latter form.
Then every one of these qualities must be present in a quantum algorithm if it is to offer
an exponential speed-up over classical algorithms.



2 Preliminary definitions 

We will need a precise definition of the notion of a quantum computational process and a 
definition of what it means to classically efficiently simulate such a process. 

The term 'state' will be used to mean a general (mixed) state. The term poly(n) will
refer to a function $f(n)$ whose growth is bounded by a polynomial function in n i.e. there
exists a polynomial $p(n)$ such that $f(n) \le p(n)$ for all sufficiently large n.

We adopt the gate array model of quantum computation as our working definition. Let
$\mathcal{G}$ be a fixed finite set of 2-qubit gates.

Definition 1 A quantum computational process (or quantum algorithm) with running
time $T(n)$ comprises the following description. For each fixed positive integer n (input size)
we have a sequence of triples

$$ A_n = \{(U_{i_0}, a_0, b_0), (U_{i_1}, a_1, b_1), \ldots, (U_{i_{T(n)}}, a_{T(n)}, b_{T(n)})\} \qquad (2) $$

where the $U_{i_k}$'s are chosen from $\mathcal{G}$ and $a_k$ and $b_k$ are positive integers. More precisely, there
is a classical algorithm running in poly($T(n)$) time, which given n, will output the list $A_n$.

The quantum algorithm corresponding to the sequence $\{A_n : n = 1, 2, \ldots\}$ runs as follows.
For each input $i_1 \ldots i_n$ of size n we start with a row of qubits $|i_1\rangle |i_2\rangle \cdots |i_n\rangle |0\rangle |0\rangle \ldots$
giving the input extended by zeroes. We apply the sequence of $T(n)$ computational steps
given in $A_n$, where the k-th step is the application of the 2-qubit gate $U_{i_k}$ to qubits $(a_k, b_k)$
in the row. After $T(n)$ steps we measure the leftmost qubit in the computational basis and
output the result (0 or 1) i.e. we give a sample from the probability distribution $\mathcal{P} = \{p_0, p_1\}$
defined by the quantum measurement on the final state.

Remark Later we will also discuss quantum computational processes on mixed states. For 
such a process of T(n) steps we will require that the input state is a mixed state of n qubits 
with a poly(n) sized description (rather than just a computational basis state of n qubits as 
above). Also the computational steps could be unitary or more generally, trace preserving 
completely positive maps on two qubits. Equivalently we could require the computational 
steps to be unitary transforms of six qubits (i.e. having a 4-qubit ancilla space) whose 
locations are all specified similar to the pairs of qubits in the above definition. 

We will also need a notion of distance between states and between probability distributions.
For this purpose we will use the trace norm. For any operator A (on a finite
dimensional state space), the trace norm $\|A\|$ is defined by

$$ \|A\| = \mathrm{tr} \sqrt{A^\dagger A} = \sum_i \mu_i $$

where $\mu_i$ are the singular values of A. If A is hermitian then $\mu_i$ are the absolute values of
the eigenvalues. If $\rho$ is a density matrix then $\|\rho\| = 1$.

The trace norm distance $\|\rho - \sigma\|$ between states has an especially useful property of
being contractive under any trace preserving quantum operation [18]. In particular if $\rho_{AB}$
and $\sigma_{AB}$ are bipartite states and $\rho_A$, $\sigma_A$ denote the corresponding reduced states of A then

$$ \|\rho_A - \sigma_A\| \le \|\rho_{AB} - \sigma_{AB}\|. $$

Also if $\mathcal{P}$ and $\mathcal{Q}$ are the probability distributions for the outcomes of any quantum
measurement on two states $\rho$ and $\sigma$ respectively then

$$ \|\mathcal{P} - \mathcal{Q}\| \le \|\rho - \sigma\| \qquad (3) $$

where $\|\mathcal{P} - \mathcal{Q}\| = \sum_i |p_i - q_i|$ is the trace norm distance between the distributions viewed
as diagonal states.
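
For distributions viewed as diagonal states, the distance and its contractivity under taking a reduced state can be checked directly. The following is a small sketch under that restriction only (classical joint distributions on two bits, with marginalisation playing the role of the reduced state); the example distributions are hypothetical and chosen purely for illustration.

```python
from fractions import Fraction as F

def trace_distance(P, Q):
    """||P - Q|| = sum_i |p_i - q_i| for distributions given as equal-length lists."""
    return sum(abs(p - q) for p, q in zip(P, Q))

def marginal_A(joint):
    """Marginal on A of a joint distribution joint[a][b] (a diagonal bipartite state)."""
    return [sum(row) for row in joint]

def flatten(joint):
    """List the joint probabilities in a fixed order."""
    return [x for row in joint for x in row]

# Hypothetical joint distributions on two bits (A, B).
P_AB = [[F(1, 2), F(0)], [F(1, 4), F(1, 4)]]
Q_AB = [[F(1, 4), F(1, 4)], [F(1, 4), F(1, 4)]]

d_AB = trace_distance(flatten(P_AB), flatten(Q_AB))        # distance of joint states
d_A = trace_distance(marginal_A(P_AB), marginal_A(Q_AB))   # distance after reduction
```

Here the two marginals on A coincide, so the distance drops from 1/2 to 0, consistent with the contractive property.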

Finally we give our definition of the notion of classical efficient simulation. 

Definition 2 A quantum computation $\{A_n : n = 1, 2, \ldots\}$ with output probability distribution
$\mathcal{P}$ can be efficiently classically simulated if the following condition is satisfied:
Given only classical means (i.e. a universal classical computer which is also able to make
probabilistic choices) and the description $A_n$ (i.e. the classical poly-time algorithm for
generating $A_n$ from n), then for any $\eta > 0$, we are able to sample once a distribution $\mathcal{P}'$ with
$\|\mathcal{P} - \mathcal{P}'\| \le \eta$ using a classical computational effort that grows polynomially with n and
$\log(1/\eta)$.

In the above definition we should be more precise about the computer's ability to make
probabilistic choices. Just as computational steps cannot be arbitrary, but must be chosen
from a fixed finite set of gates (giving a measure of computational effort for any desired
transformation) we need to have a measure of how much computational effort is required
to sample a given probability distribution $\{p_0, p_1\}$. We assume that the computer can only
toss a fair coin i.e. sample the probability distribution $\{\mathrm{prob}(0) = \frac{1}{2}, \mathrm{prob}(1) = \frac{1}{2}\}$ and
this counts as a single computational step. Let $0 \le x < 1$ be any number whose binary
expression has at most n binary digits: $x = 0.i_1 i_2 \ldots i_n$. Then the computer can sample the
distribution $\mathcal{P}_x = \{x, 1 - x\}$ in n steps as follows: toss the fair coin n times giving a sequence
of results $j_1 \ldots j_n$. View $j_1 \ldots j_n$ as an n digit binary number (and similarly $i_1 \ldots i_n$).
Then $\mathrm{prob}(j_1 \ldots j_n < i_1 \ldots i_n) = x$ so we get a sampling of $\mathcal{P}_x$ by comparing the random
output $j_1 \ldots j_n$ to the given $i_1 \ldots i_n$. If x has an infinite binary expansion we can sample an
n digit approximation to $\mathcal{P}_x$ by the above method using poly(n) steps i.e. we can sample
$\mathcal{P}'$ having $\|\mathcal{P}' - \mathcal{P}\| \le \eta$, with poly($\log 1/\eta$) computational effort and we adopt the latter
simulation rate as the definition of efficient simulability for a general probability distribution.
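
The fair-coin sampling procedure just described can be sketched in a few lines of Python. The function names are illustrative, and the use of `random.getrandbits(1)` as the fair coin is an assumption of this sketch; the second function replaces the random coin by an exhaustive enumeration of all toss sequences so that the probability of the first outcome can be computed exactly.

```python
import random
from fractions import Fraction
from itertools import product

def sample_bit(x_digits, coin=lambda: random.getrandbits(1)):
    """Sample {x, 1 - x} for x = 0.i1 i2 ... in (binary) using n fair coin tosses.

    Returns 0 (probability x) when j1...jn < i1...in, otherwise 1.
    """
    tosses = [coin() for _ in x_digits]
    # Equal-length bit lists compare lexicographically, which coincides
    # with comparing the corresponding n-digit binary numbers.
    return 0 if tosses < list(x_digits) else 1

def exact_probability_of_zero(x_digits):
    """Enumerate all 2**n toss sequences and count those that yield outcome 0."""
    n = len(x_digits)
    hits = 0
    for seq in product((0, 1), repeat=n):
        it = iter(seq)
        if sample_bit(x_digits, coin=lambda: next(it)) == 0:
            hits += 1
    return Fraction(hits, 2 ** n)

# x = 0.101 in binary, i.e. 5/8.
prob_zero = exact_probability_of_zero((1, 0, 1))
```

For x = 0.101 exactly five of the eight toss sequences (000 to 100) lie below the digits of x, so outcome 0 occurs with probability 5/8.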

In many applications we do not need to consider arbitrarily small $\eta$ as in definition
2 and a weaker simulation requirement suffices. Suppose that the quantum computation
is a BQP algorithm for a decision problem. Thus for any input, the output distribution
$\mathcal{P} = \{p_0, p_1\}$ has the property that the probability of obtaining a correct answer is $> \frac{2}{3}$.
In that case the decision problem will have a classical efficient (BPP) algorithm if we can
efficiently classically simulate a distribution $\mathcal{P}' = \{p'_0, p'_1\}$ with $\|\mathcal{P} - \mathcal{P}'\| < \eta_0$ where $\eta_0$
is a suitably small constant, so that the probability of a correct answer with $\mathcal{P}'$ is still bounded away from
$\frac{1}{2}$. Indeed we will see (theorem 2 below) that for such a finite tolerance classical simulation
to be ruled out, not only must the quantum algorithm exhibit multi-partite entanglement
in its states but furthermore the amount of this entanglement must be suitably large (with
a lower bound depending on $\eta_0$ and the running time of the algorithm).

3 Simulation by classical computation 

One method for classically simulating a quantum computation is to directly compute the 
state at each step from the sequence of unitary operations prescribed in the quantum al- 
gorithm. We will investigate the implications of this simulation for the power of quantum 
computing compared to classical computing, especially noting conditions which guarantee 
that this simulation is efficient. 

Let $|\alpha_j\rangle$ be the state after j steps of computation, which we may assume is a general state
of m qubits, with m at most 2j (by neglecting unused qubits from the initial row). In the computational
basis we have:

$$ |\alpha_j\rangle = \sum_{i_1, \ldots, i_m} a_{i_1 \ldots i_m} |i_1\rangle \cdots |i_m\rangle. \qquad (4) $$

Then $|\alpha_{j+1}\rangle$ is obtained by applying the 2-qubit gate $U_{i_j}$ to qubits $a_j$ and $b_j$. For clarity
let us assume that these are the first two qubits i.e. $a_j = 1$ and $b_j = 2$ (all other possible
cases are similar). The amplitudes $a'$ of the updated state are calculated as:

$$ a'_{i_1 i_2 i_3 \ldots i_m} = \sum_{j_1, j_2 = 0,1} M_{i_1 i_2}^{j_1 j_2} \, a_{j_1 j_2 i_3 \ldots i_m} \qquad (5) $$

where $M_{i_1 i_2}^{j_1 j_2}$ is the 4 by 4 matrix of $U_{i_j}$.

This classical computation may fail to be efficient for two reasons. Firstly there are
exponentially many amplitudes that need to be computed and the matrix multiplication of M
in eq. (5) needs to be carried out exponentially many times. This inefficiency is intimately
related to the fact that the states $|\alpha_j\rangle$ are generally entangled and the implications of this
obstacle to efficient simulation will be elaborated below.
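
A direct amplitude simulation of this kind can be sketched as follows. This is a minimal illustration and not an efficient method: it stores all $2^m$ amplitudes in a dictionary, applies each 2-qubit gate at arbitrary positions by the update of eq. (5), and finally computes the output distribution of the leftmost qubit. The function names are assumptions of this sketch, and qubit positions are 0-based here whereas the text labels qubits from 1.

```python
from itertools import product

def apply_gate(amps, M, a, b, m):
    """Apply the 4x4 gate M to distinct qubits a and b (0-based) of an m-qubit state."""
    new = {}
    for bits in product((0, 1), repeat=m):
        total = 0
        for j1, j2 in product((0, 1), repeat=2):
            src = list(bits)
            src[a], src[b] = j1, j2
            total += M[2 * bits[a] + bits[b]][2 * j1 + j2] * amps[tuple(src)]
        new[bits] = total
    return new

def leftmost_distribution(amps):
    """Output distribution [p0, p1] for measuring the leftmost qubit."""
    p = [0, 0]
    for bits, amp in amps.items():
        p[bits[0]] += abs(amp) ** 2
    return p

def run(steps, input_bits, m):
    """Run a list of triples (M, a, b) on |input_bits>|0...0> with m qubits in total."""
    start = tuple(input_bits) + (0,) * (m - len(input_bits))
    amps = {bits: int(bits == start) for bits in product((0, 1), repeat=m)}
    for M, a, b in steps:
        amps = apply_gate(amps, M, a, b, m)
    return leftmost_distribution(amps)

# CNOT with the first listed qubit as control and the second as target.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Input 01 extended by one zero; qubit 1 controls a flip of qubit 0.
result = run([(CNOT, 1, 0)], (0, 1), 3)
```

Each call to `apply_gate` visits every one of the $2^m$ basis strings, which is the exponential cost referred to above.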

The second possible difficulty with the computation in eq. (5) arises from the fact that
the matrix entries of M are generally continuous parameters (real or complex numbers)
so that even the individual arithmetic operations (additions and multiplications) involved
might be prohibitively costly. This second issue will be circumvented by considering rational
approximations to gates.

Definition 3 A quantum gate is rational if its matrix elements (in the computational 
basis) have rational numbers as real and imaginary parts. 

The main property of rational numbers that we will need is the following. 

Lemma 1 Let $\mathcal{D} = \{r_1, \ldots, r_l\}$ be a finite set of rational numbers whose numerators and
denominators have at most m digits. For any j let x be an arithmetic expression constructed
from the elements of $\mathcal{D}$ using at most j additions and multiplications. Then x can be computed
exactly (as a rational number) using a number of steps of computation that grows
polynomially with j and m.

Proof This is an easy consequence of the polynomial (in the number of digits) computability
of integer arithmetic and of the fact that the number of digits of the numerator and
denominator of an arithmetic expression in the rationals in $\mathcal{D}$ grows at most linearly with
the number of operations. QED.
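
As an illustration of exact rational arithmetic, the following sketch uses Python's `fractions.Fraction`. It assumes a simple form of expression in which each operation combines the running value with one element of the set; the helper names and the sample set are hypothetical.

```python
from fractions import Fraction

def evaluate(ops, start):
    """Fold a list of ('+' or '*', rational) operations onto start, exactly."""
    x = start
    for op, r in ops:
        x = x + r if op == '+' else x * r
    return x

def digits(x):
    """Decimal digit counts of the numerator and denominator of x."""
    return len(str(abs(x.numerator))), len(str(x.denominator))

# Hypothetical set of rationals with at most 2-digit numerators and denominators.
D = [Fraction(3, 5), Fraction(-4, 5), Fraction(7, 25)]
ops = [('*', D[0]), ('+', D[1]), ('*', D[2]), ('+', D[0])]
x = evaluate(ops, D[0])   # ((3/5 * 3/5) - 4/5) * 7/25 + 3/5
```

The result is held as an exact numerator and denominator, and `digits(x)` reports how large they have become after the four operations.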

Apart from rational gates, many other possible classes of gates, having the essential
polynomial property of lemma 1, would suffice for our purposes. For example we could
allow the matrix elements to be members of a finite algebraic extension of the rationals.
This would allow numbers such as $\cos \pi/8$ which appear as matrix elements of
commonly used universal sets of gates [18].

Let us now return to the relation of entanglement to the efficiency of the classical
computation in eq. (5).

Definition 4 Let $\rho$ be a state of m qubits where the qubits are labelled by $B = \{1, 2, \ldots, m\}$.
$\rho$ is p-blocked if B can be partitioned into subsets of size at most p:

$$ B = B_1 \cup B_2 \cup \ldots \cup B_k, \qquad |B_i| \le p $$

and $\rho$ is a product state for this partition:

$$ \rho = \rho_1 \otimes \rho_2 \otimes \cdots \otimes \rho_k $$

where $\rho_i$ is a state of the qubits in $B_i$ only.

Note that a pure state is p-blocked if and only if no p + 1 qubits are all entangled
together. But a mixed state $\rho$ can be unentangled in the sense of being separable, without
being p-blocked i.e. the p-blocked condition ("product of mixtures") is stronger than the
separability condition ("mixture of products").

Our next result transparently shows the necessity of multi-partite entanglement for 
exponential quantum computational speed-up in a restricted situation where there are no 
additional complications arising from the description of the gates (as they are assumed to 
be rational). 

Lemma 2 Let $\mathcal{G}$ be a finite set of rational 2-qubit gates and p a fixed positive integer.
Suppose that $\{A_n : n = 1, 2, \ldots\}$, using gates from $\mathcal{G}$, is a polynomial time quantum
computation with the property that at each stage j = 1, ..., poly(n) the state $|\alpha_j\rangle$ is p-blocked.
Then the final probability distribution $\mathcal{P}$ can be classically exactly computed with poly(n)
computational steps so the quantum computation can be classically efficiently simulated.

Proof of lemma 2 Any p-blocked pure state $|\psi\rangle$ of m qubits may be fully described with
the following data:

(a) (Block locations) A list of m integers $(b_1, \ldots, b_m)$ where $1 \le b_i \le m$. $b_i$ gives the number
of the block to which the i-th qubit belongs. For example the list (3,5,4,3,3, ...) denotes
that qubit 1 is in block 3, qubit 2 is in block 5 and so on. Note that the number of blocks
can grow at most linearly with m.

(b) (Block states) For each block we give its state by listing the amplitudes in the computational
basis. This requires at most $2^{p+1}$ real numbers since each block has size at most p
qubits.

Note that for fixed p and increasing m the total size of the description grows only polynomially
with m assuming that the real numbers in (b) can each be described with poly(m)
bits of memory. This is in contrast to the exponentially growing number of amplitudes
needed to describe a general entangled state.
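
The difference in description size can be made concrete with a short sketch. The two helper functions below are illustrative only: the first gives the upper bound implied by items (a) and (b), counting m block labels plus at most m blocks of $2^{p+1}$ real numbers each, and the second counts the real numbers in a general list of $2^m$ complex amplitudes.

```python
def blocked_description_size(m, p):
    """Upper bound on stored numbers for a p-blocked pure state of m qubits:
    m block labels plus at most m blocks of 2**(p + 1) real numbers each."""
    return m + m * 2 ** (p + 1)

def general_description_size(m):
    """Real numbers needed for a general m-qubit pure state (2**m complex amplitudes)."""
    return 2 ** (m + 1)

# Example: 50 qubits with blocks of at most 2 qubits.
blocked = blocked_description_size(50, 2)
general = general_description_size(50)
```

For fixed p the first count grows linearly in m, while the second doubles with every added qubit.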

Now to classically simulate the p-blocked algorithm we simply update the ((a),(b))
description of the state at each step. Note that the location of the blocks (i.e. (a)) will
generally change as well as the states of the blocks themselves (i.e. (b)). The j-th
computational step is given by the action of a 2-qubit gate $U_{i_j}$ on a p-blocked state and we
distinguish two cases:

Case 1: the gate acts on two qubits which are already in the same block. Thus (a) remains
unchanged and the state of the chosen block is updated in (b) by applying the unitary
matrix of size at most $2^p \times 2^p$, requiring a constant number of arithmetic operations (which
does not grow with j, the counter describing the step of the quantum algorithm that is
being simulated).

Case 2: the 2-qubit gate straddles two existing blocks $B_1$ and $B_2$ of sizes $p_1 \le p$ and $p_2 \le p$
respectively. We again update the state of all $p_1 + p_2$ qubits by applying a unitary matrix
of size at most $2^{2p} \times 2^{2p}$ (requiring a constant number of operations that does not grow
with j). If $p_1 + p_2 \le p$ we also update (a) by amalgamating the two block labels into a
single label, to complete the step. If $p_1 + p_2 > p$ we need to identify a new block structure
(with blocks of size $\le p$) within the qubits of $B_1$ and $B_2$. One method is to compute the
reduced state $\rho_X$ of every subset $X \subseteq B_1 \cup B_2$ having size $\le p$, and compare the global
state of $B_1 \cup B_2$ with $\rho_{X_1} \otimes \cdots \otimes \rho_{X_k}$ for each partition $X_1, \ldots, X_k$ of $B_1 \cup B_2$ (looking for
an equality of states). This calculation again clearly needs a bounded number of arithmetic
operations (independent of j). Finally we update (a) and (b) with the newly found blocks
and their corresponding states.
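
For the simplest case p = 1 (every block is a single qubit) the update of Case 2 can be sketched as below. This is an illustrative special case and not the general block search: it assumes real rational amplitudes held as `Fraction`s, applies a 4 by 4 gate to two single-qubit blocks, and tests for a product state through the determinant of the 2 by 2 amplitude matrix instead of comparing reduced states. The function name is hypothetical, and the returned factors reproduce the joint state without being individually normalised.

```python
from fractions import Fraction as F

def apply_and_split(q1, q2, M):
    """Apply the 4x4 gate M to the product state q1 (x) q2 and try to re-split it.

    Returns a pair of single-qubit vectors if the result is a product state,
    or None if the two qubits have become entangled (not 1-blocked).
    """
    joint = [q1[i] * q2[j] for i in (0, 1) for j in (0, 1)]
    out = [sum(M[r][c] * joint[c] for c in range(4)) for r in range(4)]
    A = [out[0:2], out[2:4]]  # A[i][j] is the amplitude of |i>|j>
    # A pure 2-qubit state is a product exactly when det(A) = 0 (rank 1).
    if A[0][0] * A[1][1] - A[0][1] * A[1][0] != 0:
        return None
    # Pick a non-zero entry A[r][c]; then A[i][j] = A[i][c] * A[r][j] / A[r][c].
    r, c = next((i, j) for i in (0, 1) for j in (0, 1) if A[i][j] != 0)
    first = [F(A[0][c]) / A[r][c], F(A[1][c]) / A[r][c]]
    second = list(A[r])
    return first, second

CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

# Control in |0>: the state stays a product of single-qubit blocks.
kept = apply_and_split([F(1), F(0)], [F(3, 5), F(4, 5)], CNOT)
# Control in a superposition: the output is entangled, so no split exists.
lost = apply_and_split([F(3, 5), F(4, 5)], [F(1), F(0)], CNOT)
```

A return value of `None` signals that the state has left the 1-blocked family, which the hypothesis of the lemma excludes for the chosen p.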

This gives a classical simulation of the quantum algorithm using a number of rational
arithmetic operations that grows linearly with j. Lemma 1 then guarantees that the
calculation can be done with poly(j) elementary computational steps, giving an efficient
classical computation of the final state of the quantum algorithm. Finally we identify the
block containing the leftmost qubit, compute the probability distribution $\mathcal{P}$ and sample it
once with a corresponding classical probabilistic choice, completing the efficient classical
simulation of the quantum algorithm. QED.

Remark Although lemma 2 has been stated for quantum algorithms with pure states it is
readily generalised to the case of mixed states: suppose that a quantum algorithm (with
rational gates) has a p-blocked mixed state at each stage. Then it can be classically efficiently
simulated. Indeed the above proof does not require the block states to be pure. In a similar
way, the theorems below also easily generalise to p-blocked mixed state processes (although
for clarity we give the statements and proofs only for the case of pure states).

Lemma 2 depends on two essential ingredients: (a) p-blockedness implying a polynomial
number of parameters for state descriptions and (b) rationality of gates, guaranteeing that
the classical arithmetic operations can be efficiently carried out. Our next result shows that
the condition (b) can be lifted.

Theorem 1 Let $\mathcal{G}$ be a finite set of 2-qubit gates and p a fixed positive integer. Suppose
that $\{A_n : n = 1, 2, \ldots\}$, using gates from $\mathcal{G}$, is a polynomial time quantum computation
with the property that at each stage j = 1, ..., poly(n) the state $|\alpha_j\rangle$ is p-blocked. Then the
quantum computation can be classically efficiently simulated.

Before going on to consider the proof of theorem 1 we make some remarks on the significance
of this result. Theorem 1 shows that multi-partite entanglement with unboundedly
many qubits entangled together, is a necessary feature of any quantum algorithm (oper- 
ating on pure states) if the algorithm is to exhibit an exponential speed-up over classical 
computation. Indeed absence of increasing numbers of entangled qubits corresponds to a 
fixed value of p. In contrast, in communication tasks, entanglement restricted to merely bi- 
partite entanglement suffices for exponential benefits (for example an exponential reduction 
of communication complexity |]l^) but for "standard" computational tasks the availability 
of (even increasing amounts of) only bi-partite entanglement cannot give an exponential 
speed-up (in contrast to the fact that 2-qubit gates suffice for universal computation) — 
the number of qubits entangled together must grow as an unbounded function of input size. 
Indeed even if every pair of qubits become entangled at some stage of the computation and 
there is no higher order entanglement, the quantum algorithm will still have an efficient 
classical simulation. This shows that the role of entanglement in computation is essentially 
different from its role in communication. Theorem 1 also implies that distributed quantum
computing (on pure states), which allows any number of local quantum processors, but of 
bounded size and classical communication between them, cannot offer an exponential speed- 
up over classical computation - if the local processors have size up to p qubits each then 
the state will be p-blocked at every stage. 

Our approach to proving theorem 1 will be to replace general gates by rational approximations
with increasingly high precision. Recall [10] that there exists a finite universal set
of rational 2-qubit gates. Hence any quantum computation $QC_1$ with gate set $\mathcal{G}_1 = \{U_i\}$
can be efficiently approximated by a quantum computation $QC_2(\epsilon)$ having rational gates
$\mathcal{G}_2 = \{\tilde U_i\}$ where $\|U_i - \tilde U_i\| < \epsilon$ (for any chosen $\epsilon > 0$). As far as a quantum physical
implementation is concerned, $QC_2$ behaves very similarly to $QC_1$ in its action on states
[15]. But for our classical computational simulations there can be a dramatic difference: if
$QC_1$ is a p-blocked computation (and hence efficiently simulable by the theorem) then the
approximation $QC_2$ will generally not be p-blocked so we cannot invoke the lemma to claim
that $QC_2$ (although near to $QC_1$ and having rational gates) is efficiently simulable. Indeed
the fact that p-blocked states of $m$ qubits have a poly($m$) sized exact description (which is
crucial in the proof of lemma 2) is immediately lost under arbitrarily small perturbations
to general states, which require an exponentially large description (of exponentially many
independent amplitudes). Hence the theorem is not a straightforward consequence of the
lemma via efficient rational approximation of the gates, and as will be seen, its proof will
require substantial further ingredients. Our strategy will be to further modify $QC_2$ to a
nearby process $\widetilde{QC}_2$ whose states are p-blocked approximations of the states of $QC_2$.
This leads us to consider the simulation of algorithms whose states remain suitably close
to p-blocked states, so they may have a small amount of entanglement between the blocks.
Consequently we will develop suitable approximations with polynomially sized descriptions,
to states that are not p-blocked, but are suitably near to p-blocked states. This is the content
of theorems 2 and 3 below (and theorem 1 will appear as a special case).

Although we are using the gate array model of quantum computation for our arguments, 
our results will apply to other models as well. Suppose we have any model of quantum com- 
putation with the following properties: 

(a) At each stage of the computation we have a pure state of a system comprising subsys- 
tems of a bounded size; 

(b) The update of the state is effected by a unitary transform or by a measurement, each 
on a bounded number of subsystems. 

In that case our proofs readily generalise to show that if the states are p-blocked at every 
stage then the computational process will have an efficient classical simulation i.e. that 
multi-partite entanglement of unboundedly many subsystems must be present for an exponential
computational speed-up. The above criteria for a computational model are satisfied
by the quantum Turing machine model [H] as well as some recently proposed models that
focus on measurement operations [16, 17].



As mentioned above we will prove theorem 1 in a more general form motivated as follows.
Let $|\alpha_j\rangle$ be the pure state at step $j$ in a quantum algorithm. Write $\alpha_j$ as an abbreviation
for $|\alpha_j\rangle\langle\alpha_j|$. Theorem 1 requires that $\alpha_j$ be exactly p-blocked for each $j$. Suppose that
$\alpha_j$ is not p-blocked but is made up of blocks of size at most $p$ with a "small" amount of
entanglement between the blocks. We formalise this by requiring that for each $j$ there is a
p-blocked (possibly mixed) state $\beta_j$ close to $\alpha_j$:

$$\|\alpha_j - \beta_j\| \le \epsilon. \qquad (6)$$

Theorem 1 states that if $\epsilon = 0$ then a classical efficient simulation exists. Our question
now is: how large can $\epsilon$ be in eq. (6) so that the quantum algorithm can still be efficiently
classically simulated? i.e. we wish to study the stability of efficient simulability under small
perturbations of the p-blockedness condition. The following theorem says that if we accept
an error tolerance $\eta$ in the output probability distribution, then the quantum algorithm will
have an efficient classical simulation (to within the tolerance) for nonzero $\epsilon$ exponentially
small in the running time ($\epsilon \sim \eta c^T$ for $0 < c < 1$).

Theorem 2 Let $\mathcal{G}$ be a finite set of 2-qubit gates and $p$ a fixed positive integer. Suppose
that $\{A_n : n = 1, 2, \ldots\}$, using gates from $\mathcal{G}$, is a polynomial time quantum computation with
running time $T = \mathrm{poly}(n)$. Let $\mathcal{P}$ be the output probability distribution of the algorithm.
For $j = 0, \ldots, T$ let $|\alpha_j\rangle$ denote the state at stage $j$ and write $|\alpha_j\rangle\langle\alpha_j| = \alpha_j$.
Suppose that the states $\alpha_j$ are not exactly p-blocked but there exists a sequence of p-blocked
states $\beta_j$ (generally depending on $n$ too) such that

$$\|\alpha_j - \beta_j\| \le \epsilon. \qquad (7)$$

(The identities of the states $\beta_j$ are not assumed to be known.)

Then for any $\eta > 0$, if $\epsilon \le \frac{\eta}{4(2p+4)^T}$, we can classically sample a distribution $\mathcal{P}'$ with
$\|\mathcal{P} - \mathcal{P}'\| \le \eta$ using poly($T$, $\log 1/\eta$) classical computational steps.

Theorem 1 is an immediate consequence of theorem 2 (by taking $\epsilon = 0$ in the latter
theorem). To prove theorem 2 we first consider an analogue (theorem 3 below) with rational
gates that may vary with $n$. Then our strategy for theorem 2 will be to replace gates
by rational approximations, but these approximations will need to become suitably more
accurate as $n$ increases. Theorem 3 may also be viewed as a generalisation of lemma 2,
allowing a small amount of inter-block entanglement. If the states $\alpha_j$ in lemma 2 are only
suitably close to being p-blocked (rather than being exactly p-blocked) then a probability
distribution $\mathcal{P}'$ suitably close to $\mathcal{P}$ can still be efficiently calculated.



Theorem 3 Suppose that $\{A_n : n = 1, 2, \ldots\}$ is a polynomial time quantum computation
with running time $T = \mathrm{poly}(n)$ and using only rational gates. Let $m$ (which may grow with
$n$) be the largest number of digits of the denominators and numerators of the rational gates
used in $\{A_k : 1 \le k \le n\}$. Let $\mathcal{P}$ be the output probability distribution of the algorithm. For
$j = 0, \ldots, T$ let $|\alpha_j\rangle$ denote the state at stage $j$ and write $|\alpha_j\rangle\langle\alpha_j| = \alpha_j$.
Suppose that the states $\alpha_j$ are not exactly p-blocked but there exists a sequence of p-blocked
states $\beta_j$ (generally depending on $n$ too) such that

$$\|\alpha_j - \beta_j\| \le \epsilon. \qquad (8)$$

(The identities of the states $\beta_j$ are not assumed to be known.)

Then for any $\eta > 0$, if $\epsilon \le \frac{\eta}{(2p+4)^T}$, we can classically sample a distribution $\mathcal{P}'$ with
$\|\mathcal{P} - \mathcal{P}'\| \le \eta$ using poly($T$, $m$) classical computational steps.
Proof of theorem 3 We have a computational process

$$|\alpha_0\rangle = |i_1\rangle \ldots |i_n\rangle |0\rangle |0\rangle \ldots, \qquad |\alpha_{j+1}\rangle = U_j |\alpha_j\rangle, \quad j = 0, \ldots, T$$

where each $U_j$ is a rational 2-qubit gate. Using the existence of the sequence $\beta_j$ we will
show that there is a sequence of states $\rho_j$ and numbers $\epsilon_j$ with the following properties:

(a) $\|\rho_j - \alpha_j\| \le \epsilon_j$.

(b) $\rho_j$ is p-blocked.

(c) $\rho_j$ can be classically computed in poly($j$, $m$) steps.

(d) $\epsilon_0 = 0$ and $\epsilon_{j+1} \le (2p+3)(\epsilon_j + \epsilon)$.

(Note that the p-blocked states $\beta_j$ are not assumed to be known and they will not generally
satisfy the crucial condition (c).)

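Assuming all the above is given, we see from (d) that $\epsilon_1 \le (2p+3)\epsilon$ and $\epsilon_{j+1} \le (2p+4)\epsilon_j$
so

$$\epsilon_j \le \epsilon(2p+4)^j.$$

Hence, given $\eta$, if we choose any $\epsilon \le \eta(2p+4)^{-T}$ we will have

$$\|\rho_T - \alpha_T\| \le \epsilon_T \le \eta.$$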

Consequently by eq. (^: 

$$\|\mathcal{P} - \mathcal{P}'\| \le \eta$$

where $\mathcal{P}'$ is the probability distribution arising from a quantum measurement on $\rho_T$. Hence
by (c), $\mathcal{P}'$ can be efficiently classically sampled, as required.
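
The growth of the error bounds in (d) can be checked with exact rational arithmetic. The following Python sketch iterates the worst case of the recursion and confirms the closed-form bound $\epsilon_j \le \epsilon(2p+4)^j$; the function name `error_bounds` and the sample values of $p$, $T$ and $\eta$ are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def error_bounds(p, eps, steps):
    """Iterate the worst case of (d): e_0 = 0, e_{j+1} = (2p+3)(e_j + eps)."""
    bounds = [Fraction(0)]
    for _ in range(steps):
        bounds.append((2 * p + 3) * (bounds[-1] + eps))
    return bounds

p, T = 2, 20                      # illustrative block size and running time
eta = Fraction(1, 100)            # illustrative output tolerance
eps = eta / (2 * p + 4) ** T      # the choice eps = eta (2p+4)^(-T)

bounds = error_bounds(p, eps, T)
# The closed-form bound e_j <= eps (2p+4)^j holds at every step ...
assert all(e <= eps * (2 * p + 4) ** j for j, e in enumerate(bounds))
# ... so the final error stays within the tolerance eta.
assert bounds[T] <= eta
print(float(bounds[T]), "<=", float(eta))
```

Exact fractions are used so that the comparison is not affected by floating point rounding when $\epsilon$ is exponentially small in $T$.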

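The states $\rho_j$ are calculated sequentially as follows. Set $\rho_0 = \alpha_0$ so $\|\rho_0 - \alpha_0\| \le \epsilon_0 = 0$.
Suppose we have generated a p-blocked state $\rho_j$ for the $j$th step with

$$\|\rho_j - \alpha_j\| \le \epsilon_j.$$

We construct $\rho_{j+1}$ as follows. Let $\tau = U_j \rho_j U_j^\dagger$, with the 2-qubit rational gate $U_j$ acting on
qubits from $C = C_1 \cup C_2$ where $C_1$ and $C_2$ are blocks in $\rho_j$ (so $|C| \le 2p$). $\tau$ might not be a
p-blocked state if $|C| > p$ but outside of $C$, $\tau$ remains p-blocked because $\rho_j$ was p-blocked (and
$\tau$ and $\rho_j$ agree outside $C$). Since $\alpha_{j+1} = U_j \alpha_j U_j^\dagger$ too, we have $\|\alpha_{j+1} - \tau\| = \|\alpha_j - \rho_j\| \le \epsilon_j$.
Combining this with $\|\alpha_{j+1} - \beta_{j+1}\| \le \epsilon$ gives

$$\|\tau - \beta_{j+1}\| \le \epsilon_j + \epsilon. \qquad (9)$$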

For any subset $X$ of qubit positions let $\tau_X$ (respectively $\rho_{X(j)}$ and $\beta_{X(j+1)}$) denote the reduced
state of $X$ in $\tau$ (respectively $\rho_j$ and $\beta_{j+1}$). We aim to decompose $C$ into $K \le 2p$ blocks $E_i$
of size at most $p$ so that $\|\tau_C - \tau_{E_1} \otimes \cdots \otimes \tau_{E_K}\|$ remains suitably small.

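Now $\beta_{j+1}$ is p-blocked so the reduced state $\beta_{C(j+1)}$ is p-blocked too. Hence $\beta_{C(j+1)} =
\beta_{D_1(j+1)} \otimes \cdots \otimes \beta_{D_K(j+1)}$ where $C = D_1 \cup \ldots \cup D_K$, $|D_i| \le p$ and $K \le 2p$. From eq. (9) we
have $\|\beta_{C(j+1)} - \tau_C\| \le \epsilon_j + \epsilon$ and

$$\|\beta_{D_i(j+1)} - \tau_{D_i}\| \le \epsilon_j + \epsilon \quad \text{for } i = 1, \ldots, K.$$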

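Thus (cf the hybrid argument of [15])

$$\|\beta_{D_1(j+1)} \otimes \cdots \otimes \beta_{D_K(j+1)} - \tau_{D_1} \otimes \cdots \otimes \tau_{D_K}\| \le K(\epsilon_j + \epsilon) \le 2p(\epsilon_j + \epsilon)$$

and using the triangle inequality

$$\begin{aligned}
\|\tau_C - \tau_{D_1} \otimes \cdots \otimes \tau_{D_K}\| &\le \|\tau_C - \beta_{C(j+1)}\| + \|\beta_{C(j+1)} - \beta_{D_1(j+1)} \otimes \cdots \otimes \beta_{D_K(j+1)}\| \\
&\quad + \|\beta_{D_1(j+1)} \otimes \cdots \otimes \beta_{D_K(j+1)} - \tau_{D_1} \otimes \cdots \otimes \tau_{D_K}\| \\
&\le (\epsilon_j + \epsilon) + 0 + 2p(\epsilon_j + \epsilon) \\
&= (2p+1)(\epsilon_j + \epsilon).
\end{aligned} \qquad (10)$$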
Hence there exists a partition of $C$ such that the corresponding product of reduced states
of $\tau$ approximates $\tau_C$ to within $(2p+1)(\epsilon_j + \epsilon)$. Thus we compute $\|\tau_C - \tau_{E_1} \otimes \cdots \otimes \tau_{E_K}\|$
for all partitions $C = E_1 \cup \ldots \cup E_K$ of $C$ and choose one satisfying eq. (10). Finally to get
$\rho_{j+1}$ we update $\tau_C$ by the chosen $\tau_{E_1} \otimes \cdots \otimes \tau_{E_K}$, giving a p-blocked state with

$$\begin{aligned}
\|\rho_{j+1} - \alpha_{j+1}\| &\le \|\rho_{j+1} - \tau\| + \|\tau - \beta_{j+1}\| + \|\beta_{j+1} - \alpha_{j+1}\| \\
&\le (2p+1)(\epsilon_j + \epsilon) + (\epsilon_j + \epsilon) + \epsilon \\
&\le (2p+3)(\epsilon_j + \epsilon)
\end{aligned}$$

i.e. $\epsilon_{j+1} \le (2p+3)(\epsilon_j + \epsilon)$ as required. The entire calculation in updating $\rho_j$ to $\rho_{j+1}$ is
carried out within a block $C$ of size at most $2p$ using only rational arithmetic operations
and the number of operations does not grow with $j$. Hence by lemma 1, $\rho_T$ is calculated
using a number of computational steps that grows polynomially with $T$ and $m$. QED.
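
The partition search inside the block $C$ can be sketched numerically as follows. This is a minimal illustration using floating point numpy arrays rather than the exact rational arithmetic of the proof; the norm is assumed here to be the trace norm, and the helper names (`reduced_state`, `partitions`, `best_partition`) and the four-qubit test state are hypothetical choices made for the example.

```python
from itertools import combinations
import numpy as np

def reduced_state(rho, keep, n):
    """Partial trace of an n-qubit density matrix onto the qubits in `keep`
    (qubit 0 is the most significant bit; the order of `keep` is preserved)."""
    keep = list(keep)
    drop = [q for q in range(n) if q not in keep]
    t = rho.reshape([2] * (2 * n))
    perm = keep + drop + [n + q for q in keep] + [n + q for q in drop]
    dk, dd = 2 ** len(keep), 2 ** len(drop)
    return np.einsum("aibi->ab", t.transpose(perm).reshape(dk, dd, dk, dd))

def partitions(items, p):
    """Yield all partitions of `items` into blocks of size at most p."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k in range(p):
        for others in combinations(rest, k):
            remaining = [q for q in rest if q not in others]
            for tail in partitions(remaining, p):
                yield [[first, *others]] + tail

def trace_norm(a):
    """Trace norm of a Hermitian matrix."""
    return np.abs(np.linalg.eigvalsh(a)).sum()

def best_partition(rho, C, p, n):
    """Find the partition of C minimising ||tau_C - tau_E1 x ... x tau_EK||."""
    best = None
    for blocks in partitions(list(C), p):
        order = [q for b in blocks for q in b]
        tau_C = reduced_state(rho, order, n)      # qubits listed block by block
        prod = np.array([[1.0]])
        for b in blocks:
            prod = np.kron(prod, reduced_state(rho, b, n))
        d = trace_norm(tau_C - prod)
        if best is None or d < best[0]:
            best = (d, blocks)
    return best

# Test state: Bell pairs on qubits (0, 2) and (1, 3), i.e. sum_{a,b} |a b a b> / 2.
psi = np.zeros(16)
for a in (0, 1):
    for b in (0, 1):
        psi[8 * a + 4 * b + 2 * a + b] = 0.5
rho = np.outer(psi, psi)
dist, blocks = best_partition(rho, [0, 1, 2, 3], 2, 4)
print(dist, blocks)
```

Since $|C| \le 2p$ with $p$ fixed, the number of partitions examined is bounded independently of the step $j$, which is what keeps the per-step cost constant in the proof.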

Proof of theorem 2 Let $\tilde U_j$ be rational gates with

$$\|U_j - \tilde U_j\| \le \xi \qquad (11)$$

(with $\xi$ to be chosen later to match $\eta$ and $T$) and the rational matrix elements of $\tilde U_j$ have
numerators and denominators with $O(\log\frac{1}{\xi})$ digits, which is always possible by eq. (11)
(and the use of a universal rational 2-qubit gate [10]). Consider the process

$$\tilde\alpha_0 = \alpha_0, \qquad \tilde\alpha_{j+1} = \tilde U_j \tilde\alpha_j \tilde U_j^\dagger, \qquad j = 0, 1, \ldots, T-1 \qquad (12)$$

which we will compare to the process with $\alpha_{j+1} = U_j \alpha_j U_j^\dagger$.

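If $\|\tilde\alpha_j - \alpha_j\| = \xi_j$, writing $\Delta_j = \tilde U_j - U_j$ we have $\|\Delta_j\| \le \xi$ and

$$\begin{aligned}
\|\tilde\alpha_{j+1} - \alpha_{j+1}\| &= \|(U_j + \Delta_j)\tilde\alpha_j(U_j + \Delta_j)^\dagger - U_j \alpha_j U_j^\dagger\| \\
&= \|U_j(\tilde\alpha_j - \alpha_j)U_j^\dagger + \Delta_j \tilde\alpha_j U_j^\dagger + U_j \tilde\alpha_j \Delta_j^\dagger + \Delta_j \tilde\alpha_j \Delta_j^\dagger\| \\
&\le \xi_j + \xi + \xi + \xi^2
\end{aligned}$$

where we have used the properties $\|UAV\| = \|A\|$ and $\|AB\| \le \|A\|\,\|B\|$ for any
unitary $U, V$ and arbitrary $A, B$. Thus $\xi_{j+1} = \|\tilde\alpha_{j+1} - \alpha_{j+1}\| \le \xi_j + 3\xi$. Hence

$$\|\tilde\alpha_j - \alpha_j\| \le 3j\xi, \qquad j = 0, \ldots, T. \qquad (13)$$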

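If $\mathcal{P}$ (respectively $\tilde{\mathcal{P}}$) is the output distribution of the final measurement of the quantum
algorithm performed on $\alpha_T$ (respectively $\tilde\alpha_T$) we get

$$\|\mathcal{P} - \tilde{\mathcal{P}}\| \le 3T\xi. \qquad (14)$$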
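
The linear accumulation of gate errors can be checked numerically. The sketch below is an illustration under stated assumptions: the norm on states is taken to be the trace norm and the norm on gates the operator norm, the gates act on a 2-qubit register only, and the perturbed gates are generated as nearby unitaries with floating point entries rather than as genuine rational approximations. All function names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(d):
    """Unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

def nearby_unitary(u, delta):
    """Return u @ exp(i*delta*H) for a random Hermitian H (a small perturbation)."""
    h = rng.normal(size=u.shape) + 1j * rng.normal(size=u.shape)
    h = (h + h.conj().T) / 2
    w, v = np.linalg.eigh(h)
    return u @ (v * np.exp(1j * delta * w)) @ v.conj().T

def trace_norm(a):
    """Sum of singular values."""
    return np.linalg.svd(a, compute_uv=False).sum()

T, d = 50, 4                                   # illustrative running time, 2 qubits
gates = [random_unitary(d) for _ in range(T)]
approx = [nearby_unitary(u, 1e-3) for u in gates]
xi = max(np.linalg.norm(ut - u, 2) for u, ut in zip(gates, approx))

alpha = np.zeros((d, d), dtype=complex)
alpha[0, 0] = 1.0                              # start both processes in |00><00|
alpha_t = alpha.copy()

ok = True
for j, (u, ut) in enumerate(zip(gates, approx), start=1):
    alpha = u @ alpha @ u.conj().T             # exact process
    alpha_t = ut @ alpha_t @ ut.conj().T       # perturbed process
    ok = ok and trace_norm(alpha_t - alpha) <= 3 * j * xi + 1e-12
print(ok)
```

The small additive slack `1e-12` only absorbs floating point rounding; the inequality itself is the bound of eq. (13).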

Now $\tilde\alpha_j$ are generated with rational gates but they are not necessarily p-blocked. However
they lie close to the p-blocked states $\beta_j$. From $\|\alpha_j - \beta_j\| \le \epsilon$ and eq. (13) we get

$$\|\tilde\alpha_j - \beta_j\| \le 3j\xi + \epsilon \le 3T\xi + \epsilon$$

and we can apply theorem 3 to claim the following: given any $\eta > 0$ (and writing $c = (2p+4)^{-1}$),
if $\xi$ and $\epsilon$ are chosen so that

$$3T\xi + \epsilon \le \eta c^T \qquad (15)$$

then we can sample a distribution $\mathcal{P}'$ with $\|\tilde{\mathcal{P}} - \mathcal{P}'\| \le \eta$ using poly($T$, $\log\frac{1}{\xi}$) classical
computational steps. From eq. (14) we have

$$\|\mathcal{P} - \mathcal{P}'\| \le 3T\xi + \eta \le \eta c^T + \eta \le 2\eta.$$

Now we can satisfy eq. (15) by fixing $\xi = \frac{1}{6T}\eta c^T$ and letting $\epsilon \le \frac{1}{2}\eta c^T$ so $\log\frac{1}{\xi} = \log\frac{1}{\eta} +
\mathrm{poly}(T)$ and poly($T$, $\log\frac{1}{\xi}$) = poly($T$, $\log\frac{1}{\eta}$). Finally replacing $\eta$ by $\eta/2$ in the above, we
see that for any $\eta > 0$, if $\epsilon \le \frac{1}{4}\eta c^T$ then we can sample $\mathcal{P}'$ with $\|\mathcal{P} - \mathcal{P}'\| \le \eta$ using
poly($T$, $\log\frac{1}{\eta}$) classical computational steps, as required. QED.



4 Multi-partite entanglement in Shor's algorithm 

Shor's algorithm [6] is generally believed to exhibit an exponential speed-up over any classical
factoring algorithm. Thus, in the light of the arguments given above, one would expect
that there is entanglement of an unbounded number of particles in Shor's algorithm. This 
is indeed the case as we now show. Of course, this does not show that there is no classical 
polynomial algorithm for factoring; however if it had been the case that only a bounded 
number of particles had been entangled for any value of the input size, then the results in the 
previous sections would have furnished a classical polynomial-time algorithm for factoring. 

To see that an unbounded number of particles becomes entangled in Shor's algorithm, 
it suffices to show that this happens at some point in the algorithm. The key construction 
of the algorithm is a method for determining the period r of the function 

$$f(x) = a^x \bmod N \qquad (16)$$

where $N$ is the input number to be factorised and $a < N$ has been chosen at random.
Following a standard description of the algorithm (e.g. as given in |]l^) we see that at an 
appropriate stage we will have a periodic state of about $2\log N$ qubits, of the form

$$\sum_k |x_0 + kr\rangle, \qquad (17)$$

where $0 \le x_0 < r$ is unknown and has been chosen at random (and we have omitted the
normalisation factor).

We will now show that, with high probability, these "arithmetic progression" states have
unbounded numbers of particles entangled as $N$ increases. In order to see how the argument
proceeds, consider first a case in which an arithmetic progression state is blocked. This case
is the state (with $r = 3$)

$$|AP_1\rangle = \sum_{k=0}^{3} |3 + kr\rangle = |3\rangle + |6\rangle + |9\rangle + |12\rangle. \qquad (18)$$

Expressing each state in binary we have

$$|AP_1\rangle = |0_3 0_2 1_1 1_0\rangle + |0_3 1_2 1_1 0_0\rangle + |1_3 0_2 0_1 1_0\rangle + |1_3 1_2 0_1 0_0\rangle. \qquad (19)$$
We have labelled the qubits with a subscript indicating the power of two to which it refers.
We now re-order the qubits to make it easy to see that \APi) is blocked (into two blocks 
each containing two qubits): 



The demonstration that this state is blocked proceeded by identifying which qubits are
in which block. Let us now consider a general arithmetic progression. Imagine that it can
be p-blocked. Thus we can rearrange the ordering of the qubits so that the state may be
written

$$(|a_1\rangle + |a_2\rangle + \ldots + |a_{p_1}\rangle)(|b_1\rangle + |b_2\rangle + \ldots + |b_{p_2}\rangle) \ldots (|z_1\rangle + |z_2\rangle + \ldots). \qquad (21)$$

For example if $|a_2\rangle$ is $|1_3 0_1\rangle$ then $a_2 = 8$. Each term $a_j$ in the first bracket will be a binary
expression where non-zero digits all lie in a given subset of positions (of size at most $p$) i.e.
the block corresponding to the first bracket. Different round brackets correspond to disjoint
such blocks of digit positions.

We will arrange the terms in each round bracket in increasing order of the binary
string labelling the state. The full state in the progression with lowest binary string is
$|a_1\rangle|b_1\rangle \ldots |z_1\rangle$; i.e. the binary string is $a_1 + b_1 + \ldots + z_1$. The next lowest term in the
progression is $a_1 + b_1 + \ldots + z_1 + r$. Thus one of the brackets must have the property that the
difference between the two smallest binary strings is $r$. Let us say that this bracket is the
one containing the $|a_j\rangle$ (i.e. $a_2 - a_1 = r$). Remember that all the binary numbers in the
superposition in that bracket must be expressible using a given set of up to $p$ bits (where $p$
does not increase with $N$). Now as $N$ increases, so will the typical values of $r$. A typical
$r$ will contain the pair "10" at two adjacent places in its binary representation many times
(typically one quarter of the time). For large enough $r$, this pair "10" will inevitably occur
at binary positions not included among the (at most $p$) qubits representing the $a_j$. Thus it
is not possible to choose a fixed number $p$ so that $a_1$ and $a_2 = a_1 + r$ are both expressible
using only a bounded number of binary digits. Thus the full arithmetic progression state
will not be p-blocked in general.

Therefore for general values of $N$, the number we wish to factor, the state of the computer
at this stage in the calculation is not p-blocked. Of course, as we have seen, certain
carefully chosen truncated arithmetic progression states may be blocked (e.g. it is not difficult
to construct 2-blocked arithmetic progressions of length $2^m$ for any $m$), but these are
highly atypical.
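
Whether a small arithmetic progression state factorises across a given grouping of qubits can be tested by computing a Schmidt rank. The sketch below checks the example $|3\rangle + |6\rangle + |9\rangle + |12\rangle$ against the cut $\{3,1\} \mid \{2,0\}$, and then looks at a second progression, $|1\rangle + |6\rangle + |11\rangle$, which is an illustrative choice not taken from the text. For four qubits a 2-blocked pure state must be a product across at least one of the three 2|2 cuts, so it suffices to examine those. The function names are hypothetical.

```python
import numpy as np

def ap_state(x0, r, length, n):
    """Unnormalised amplitudes of sum_k |x0 + k r> on n qubits."""
    psi = np.zeros(2 ** n)
    for k in range(length):
        psi[x0 + k * r] = 1.0
    return psi

def schmidt_rank(psi, block, n):
    """Schmidt rank of psi across the cut block | rest.

    Qubits are labelled by the power of two they represent (qubit 0 is the
    least significant bit), matching the subscripts used in the text.
    """
    rest = [q for q in range(n) if q not in block]
    t = psi.reshape([2] * n)                    # axis a holds qubit n-1-a
    axes = [n - 1 - q for q in list(block) + rest]
    m = t.transpose(axes).reshape(2 ** len(block), -1)
    return int(np.linalg.matrix_rank(m))

ap1 = ap_state(3, 3, 4, 4)                      # |3> + |6> + |9> + |12>
rank_blocked = schmidt_rank(ap1, [3, 1], 4)     # cut {3,1} | {2,0}
rank_other = schmidt_rank(ap1, [3, 2], 4)       # cut {3,2} | {1,0}
print(rank_blocked, rank_other)

generic = ap_state(1, 5, 3, 4)                  # |1> + |6> + |11>
cuts = [[3, 2], [3, 1], [3, 0]]                 # the three 2|2 cuts of four qubits
generic_ranks = [schmidt_rank(generic, c, 4) for c in cuts]
print(generic_ranks)
```

A Schmidt rank of 1 across a cut means the state is a product of the two groups of qubits; any larger value means the groups are entangled.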

Actually our argument above shows more generally, that almost all states of the form
$|a\rangle + |a + r\rangle + |b_1\rangle + \ldots + |b_m\rangle$ where $b_i > a + r$ for all $i$, will not be p-blocked. Now Simon's
algorithm |l^ involves states of the form $|x_0\rangle + |x_1\rangle$ where $x_0$ and $x_1$ are general $n$-qubit
strings. Hence this algorithm too utilises multi-partite entanglement of unboundedly many
qubits.



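$$\begin{aligned}
|AP_1\rangle &= |0_3 1_1 0_2 1_0\rangle + |0_3 1_1 1_2 0_0\rangle + |1_3 0_1 0_2 1_0\rangle + |1_3 0_1 1_2 0_0\rangle \\
&= (|0_3 1_1\rangle + |1_3 0_1\rangle) \otimes (|0_2 1_0\rangle + |1_2 0_0\rangle).
\end{aligned} \qquad (20)$$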
5 Computations with mixed states



The significance of entanglement for pure state computations derives from the fact that
unentangled pure states (or more generally p-blocked pure states) of $n$ qubits have a description
involving only poly($n$) parameters (in contrast to $O(2^n)$ parameters for a general
pure state) and consequently if entanglement is absent in a quantum computational process
then the process can be efficiently classically simulated.

The corresponding situation for mixed states is dramatically different: we define a mixed
state $\rho$ to be unentangled if it is separable i.e. among all the possible ways of representing
$\rho$ as a mixture of pure states, there is a mixture involving only product states. Then it may
be shown [20, 21, 22] that unentangled mixed states have a non-zero volume in the space
of all mixed states. As an explicit example [22] the $n$ qubit state

$$\rho = (1-\epsilon)\frac{1}{2^n} I + \epsilon\,\xi \qquad (22)$$

is unentangled for all mixed states $\xi$ if $\epsilon < \frac{1}{4^n}$. Hence unentangled mixed states require the
same number of parameters for their description as do general mixed states. Consequently if
a quantum algorithm has unentangled (mixed) states at each stage then the classical simulation
by direct computation of the updated state, fails to be efficient. From this parameter
counting point of view an unentangled mixed state has the same capacity for coding information
as a general mixed state so it is plausible that the computational power of general
mixed (or pure) quantum states is already fully present in the restricted case of unentangled
mixed states. Stated otherwise, we have the following fundamental (unresolved) question.
Suppose that a quantum computation of $N$ steps on mixed states has a separable state $\alpha_j$
at each stage $j$. Suppose also that the starting state $\alpha_0$ has a poly($N$) sized description.
Then, can the algorithm be classically efficiently simulated?

At first sight one might expect that the direct computational simulation method used 
in the previous section could be adapted to become efficient. Indeed a separable state 
is just a classical probabilistic mixture of unentangled pure states and we can efficiently 
simulate unentangled pure state processes so maybe we could just supplement the latter 
with some extra classical probabilistic choices. Indeed such a modification works in the 
restricted case of classical probabilistic processes i.e. processes in which the state at any 
stage is a probabilistic mixture of computational basis states. As a simple example consider 
a process involving $n$ coins in which the $j$th step is to toss the $j$th coin. Then the complete state
description grows exponentially with $j$ (being a probability distribution over $2^j$ outcomes
at stage $j$). But to simulate the process (i.e. sample the final distribution once) we do not
need to carry along the entire distribution - we just make probabilistic choices along the 
way (i.e. follow one path through the exponentially large branching tree of possibilities, 
rather than computing the whole tree and then sampling the final total distribution at the 
end). 
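
The contrast between the two strategies for the coin example can be made concrete. In the following sketch (function names and the `bias` parameter are illustrative assumptions), `full_distribution` tabulates all $2^n$ outcome probabilities while `sample_path` makes one probabilistic choice per step and never stores the distribution.

```python
import random
from itertools import product

def full_distribution(n, bias=0.5):
    """Explicit distribution over all 2**n outcomes (exponential in n)."""
    dist = {}
    for bits in product((0, 1), repeat=n):
        heads = sum(bits)
        dist[bits] = bias ** heads * (1 - bias) ** (n - heads)
    return dist

def sample_path(n, rng, bias=0.5):
    """Sample one final outcome with a single choice per step (linear in n)."""
    return tuple(1 if rng.random() < bias else 0 for _ in range(n))

rng = random.Random(0)
small = full_distribution(10)          # 1024 entries: feasible only for small n
outcome = sample_path(1000, rng)       # 1000 coins: the table would need 2**1000 entries
print(len(small), len(outcome))
```

Both routines sample from the same final distribution in principle; only the second remains feasible as $n$ grows.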

Unfortunately this idea appears not to work in the quantum case. Suppose that $\rho_1$
is an unentangled state, being a mixture of product states $|\xi_i\rangle$ with probabilities $p_i$ i.e.
$\rho_1 = \sum_i p_i |\xi_i\rangle\langle\xi_i|$. Suppose that $\rho_2 = U\rho_1U^\dagger$ is also unentangled for a unitary operation
$U$ (the computational step). Let $|\eta_i\rangle = U|\xi_i\rangle$. Then $\rho_2$ is a mixture of states $|\eta_i\rangle$ with
probabilities $p_i$ but there is no guarantee that $|\eta_i\rangle$ are product states! As a simple example
with 2 qubits let $\rho_1$ be an equal mixture of the product states $|\xi_1\rangle = (|0\rangle + |1\rangle)|1\rangle$ and
$|\xi_2\rangle = (|0\rangle - |1\rangle)|1\rangle$ (where we omit normalisation factors) and let $U$ be the controlled
NOT gate. Then by updating these pure states we get $\rho_2$ as an equal mixture of two
maximally entangled states $|\eta_1\rangle = |0\rangle|1\rangle + |1\rangle|0\rangle$ and $|\eta_2\rangle = |0\rangle|1\rangle - |1\rangle|0\rangle$. Although
these component states are entangled, the overall mixture is unentangled, being equivalent
to an equal mixture of the product states $|0\rangle|1\rangle$ and $|1\rangle|0\rangle$. The existence of a separable
mixture is not a property of individual component states $|\eta_i\rangle$ but a global property of the
whole ensemble $\{|\eta_i\rangle; p_i\}$. Hence we cannot follow a single probabilistic path of pure states
through a branching tree of possibilities since these pure states become entangled (and so
the simulation becomes inefficient). At each stage we need to re-compute a new separable
decomposition of the state. This computation generally requires knowledge of the whole
ensemble and hence cannot be efficient in the required sense.
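
The two-qubit example can be verified directly. The sketch below (variable names are illustrative; the first qubit is assumed to be the control of the controlled NOT gate) builds $\rho_1$, applies the gate to the components and to the mixture, and compares the result with the equal mixture of $|0\rangle|1\rangle$ and $|1\rangle|0\rangle$.

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def proj(v):
    """Projector |v><v| onto a normalised state vector."""
    return np.outer(v, v.conj())

def schmidt_rank(v):
    """Schmidt rank of a 2-qubit pure state (1 means product state)."""
    return int(np.linalg.matrix_rank(v.reshape(2, 2)))

# Controlled NOT with the first qubit as control.
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

xi1, xi2 = np.kron(plus, ket1), np.kron(minus, ket1)     # product states
rho1 = 0.5 * proj(xi1) + 0.5 * proj(xi2)

eta1, eta2 = cnot @ xi1, cnot @ xi2                      # updated components
rho2 = cnot @ rho1 @ cnot.T                              # updated mixture

separable_mix = 0.5 * proj(np.kron(ket0, ket1)) + 0.5 * proj(np.kron(ket1, ket0))
print(schmidt_rank(xi1), schmidt_rank(eta1), np.allclose(rho2, separable_mix))
```

The components $|\eta_i\rangle$ have Schmidt rank 2, yet the density matrix of the ensemble coincides with that of a mixture of product states.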

Quantum computation with mixed states has attracted much attention in recent years 
because of the experimental implementation of quantum computation using liquid state 
NMR techniques |jl^ which utilises an interesting class of mixed states. The basic idea is 
to consider so-called pseudo-pure states of the form 

$$\rho = (1-\epsilon)\frac{1}{2^n} I + \epsilon|\psi\rangle\langle\psi| \qquad (23)$$

(which occur in NMR experiments for suitably small $\epsilon$). Then for any unitary transformation
$U$ the state $U\rho U^\dagger$ is also pseudo-pure with $|\psi\rangle$ replaced by $U|\psi\rangle$. Hence given any pure
state quantum algorithm, if we implement it on the pseudo-pure analogue of its starting
state, the entire pure state algorithm will unfold as usual on the pure state perturbation
$\epsilon|\psi\rangle\langle\psi|$ of eq. (23). Furthermore if $A$ is any traceless observable then from eq. (23) we get
the expectation value

$$\langle A\rangle = \mathrm{tr}\, A\rho = \epsilon\langle\psi|A|\psi\rangle \qquad (24)$$

i.e. we obtain the average value of $A$ in the pure state $|\psi\rangle$ but the signal is attenuated by $\epsilon$.
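
Both properties of pseudo-pure states used above (invariance of the form under unitaries, and attenuation of traceless expectation values by $\epsilon$) are easy to confirm numerically. The following sketch uses randomly generated $|\psi\rangle$, $U$ and $A$; the qubit number, the value of $\epsilon$ and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 3, 0.01                      # illustrative qubit number and epsilon
d = 2 ** n

def random_complex(shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

def pseudo_pure(psi):
    """The state of eq. (23) built on the pure state psi."""
    return (1 - eps) * np.eye(d) / d + eps * np.outer(psi, psi.conj())

psi = random_complex(d)
psi /= np.linalg.norm(psi)
rho = pseudo_pure(psi)

U, _ = np.linalg.qr(random_complex((d, d)))       # a random unitary step
A = random_complex((d, d))
A = (A + A.conj().T) / 2                          # Hermitian ...
A -= np.trace(A) / d * np.eye(d)                  # ... and traceless

rho_out = U @ rho @ U.conj().T
psi_out = U @ psi
signal = np.trace(A @ rho_out).real               # <A> in the pseudo-pure state
pure_value = np.vdot(psi_out, A @ psi_out).real   # <psi|A|psi> in the pure state
print(np.allclose(rho_out, pseudo_pure(psi_out)), np.isclose(signal, eps * pure_value))
```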



We have seen in eq. (22) that for sufficiently small $\epsilon$ all pseudo-pure states are separable
so we have the intriguing possibility of implementing any quantum pure state algorithm
with its original running time, in a setting with no entanglement at all! Can this provide a
computational benefit (over classical computations) in the total absence of entanglement?
Although this is not known for general algorithms, it has been shown for Shor's algorithm
(and structurally related algorithms) [23] and for Grover's algorithm [24] that the answer is
negative: if the pseudo-pure states are required to remain separable at each stage in these
algorithms then it can be proven [23, 24] that the value of $\epsilon$ must decrease exponentially
with input size. Consequently to obtain the output result reliably via an expectation value
as in eq. (24) we either need to repeat the algorithm exponentially many times or else use
an exponentially increasing bulk of fluid in the liquid state NMR implementation. In either
case the implementation becomes inefficient. Indeed equivalent possibilities are available
in a purely classical setting. For example a classical algorithm for factoring $N$ that test
divides by all numbers up to $\sqrt{N}$ can be run in polynomial time if we have an exponential
bulk of parallel computers available, or else in exponential time on a single computer if we
run the trial divisions sequentially.
6 Is entanglement a key resource for computational power?



Recall that the significance of entanglement for pure state computations derives from the 
fact that unentangled pure states (or more generally p-blocked pure states) of n qubits 
have a description involving only poly($n$) parameters (in contrast to $O(2^n)$ parameters for
a general pure state). But this special property of unentangled states (of having a "small"
description) is contingent on a particular mathematical description, as amplitudes in the
computational basis. If we were to adopt some other choice of mathematical description 
for quantum states (and their evolution) then although it will be mathematically equiva- 
lent to the amplitude description, there will be a different class of states which now have a 
polynomially sized description i.e. two formulations of a theory which are mathematically 
equivalent (and hence equally logically valid) need not have their corresponding mathemat- 
ical descriptions of elements of the theory being interconvertible by a polynomially bounded 
computation. With this in mind we see that the significance of entanglement as a resource 
for quantum computation is not an intrinsic property of quantum physics itself but is tied 
to a particular additional (arbitrary) choice of mathematical formalism for the theory. 

Thus suppose that instead of the amplitude description we choose some other mathematical
description $\mathcal{D}$ of quantum states (and gates). Indeed there is a rich variety of possible
alternative descriptions. Then there will be an associated property $prop(\mathcal{D})$ of states
which guarantees that the $\mathcal{D}$-description of the quantum computational process grows only
polynomially with the number of qubits if $prop(\mathcal{D})$ is absent (e.g. if $\mathcal{D}$ is the amplitude
description then $prop(\mathcal{D})$ is just the notion of entanglement). Thus, just as for entanglement,
we can equally well claim that $prop(\mathcal{D})$ for any $\mathcal{D}$, is an essential resource for quantum
computational speed-up! Entanglement itself appears to have no special status here.

An explicit example of an alternative formalism and its implications for the power of
quantum computation is provided by the so-called stabiliser formalism and the Gottesman-Knill
theorem [18], [25]. The essential ingredients are as follows. (See the previous references
for details and proofs.) The Pauli group $\mathcal{P}_n$ on $n$ qubits is the group generated by all
$n$-fold tensor products of the Pauli matrices $\sigma_x, \sigma_y, \sigma_z$, the 1-qubit identity operator $I$
and multiplicative factors of $-1$ and $i$. Any subgroup $\mathcal{K}$ of $\mathcal{P}_n$ may be described by a list
of at most $n$ elements which generate the subgroup, so any subgroup has a poly($n$) sized
description. We write $\mathcal{K} = [g_1, \ldots, g_k]$ if $g_1, \ldots, g_k$ is a set of generators for $\mathcal{K}$. For each $n$
some states $|\alpha\rangle$ of $n$ qubits have a special property that they are stabilised by a subgroup
$\mathcal{K} = [g_1, \ldots, g_k]$ ($k \le n$) of $\mathcal{P}_n$ i.e. $|\alpha\rangle$ is the unique state such that $g_i|\alpha\rangle = |\alpha\rangle$ for
$i = 1, \ldots, k$. For example $|0\rangle|0\rangle + |1\rangle|1\rangle$ is the unique state stabilised by $[\sigma_x \otimes \sigma_x, \sigma_z \otimes \sigma_z]$.
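
The two-qubit example can be checked with explicit matrices. The sketch below (names are illustrative) verifies that $|0\rangle|0\rangle + |1\rangle|1\rangle$ is fixed by both generators and that the joint $+1$ eigenspace is one-dimensional, which is what "unique" means here up to normalisation and phase.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])    # sigma_x
Z = np.array([[1.0, 0.0], [0.0, -1.0]])   # sigma_z

g1 = np.kron(X, X)
g2 = np.kron(Z, Z)

bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)   # (|00> + |11>)/sqrt(2)

# The generators commute, so the product of the two (1 + g)/2 projectors
# projects onto their joint +1 eigenspace.
P = (np.eye(4) + g1) @ (np.eye(4) + g2) / 4
dimension = int(np.linalg.matrix_rank(P))

print(np.allclose(g1 @ bell, bell), np.allclose(g2 @ bell, bell), dimension)
```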

Let S be the class of all such states (for all n). States in S can be mathematically 
specified by giving a list of generators of the corresponding stabiliser subgroup i.e. states 
in S have a poly(n) sized stabiliser description. Then we have the following facts: 

(a) All computational basis states are in S;

(b) The following gates all preserve S and have simple (i.e. efficient) update rules for their 
effect on the stabiliser description of the states: 



the 1-qubit gates (on any qubit): 
and the 2-qubit controlled-NOT gate (on any two qubits);

(c) Outcome probabilities for a measurement in the computational basis are efficiently 
computable from the stabiliser description of the state; 

(d) Application of other gates (such as the Toffoli gate or ^ phase gate |jl^ which would 
provide universal computation with the gates in (b)) will generally transform states $|\alpha\rangle$ in
S to states $|\psi\rangle$ outside of S. Recall that any unitary transformation may be expressed as a
linear combination of $I, \sigma_x, \sigma_y$ and $\sigma_z$ so we could introduce a stabiliser description of $|\psi\rangle$
as a suitable subalgebra of the group algebra of $\mathcal{P}_n$. But this description of $n$-qubit states
$|\psi\rangle$ would not generally remain poly($n$) sized if a computational process involves such more
general gates.

From (a), (b) and (c) we immediately get:
Gottesman-Knill theorem: Any quantum computational process that starts with a computational
basis state and uses only the gates in (b) above (so that the states remain in
S at each stage) can be efficiently classically simulated (by computation in the stabiliser
description).

Note that such computations can generate large amounts of multi-partite entanglement 
of unboundedly many parties (e.g. by repeated use of H and controlled-NOT) so that in 
contrast to the stabiliser formalism, if we use the amplitude formalism then the computation 
will have an exponentially growing description. 
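
A minimal sketch of such a stabiliser simulation is given below for the Hadamard gate $H$ and the controlled-NOT gate only. Each generator is stored as a string of bits plus a sign, and the update rules used are the standard conjugation rules for Pauli operators from the stabiliser literature; they are stated here as an assumption, since the text above does not spell them out. The class and method names are hypothetical, and measurement is not implemented.

```python
class Stabiliser:
    """Stabiliser generators stored as x bits, z bits and a sign bit each.

    The pair (x, z) encodes a Pauli on one qubit: (0,0)=I, (1,0)=X,
    (1,1)=Y, (0,1)=Z. The description uses n generators of 2n+1 bits.
    """

    def __init__(self, n):
        self.n = n
        # |0...0> is stabilised by sigma_z on each qubit.
        self.x = [[0] * n for _ in range(n)]
        self.z = [[int(i == j) for j in range(n)] for i in range(n)]
        self.r = [0] * n

    def h(self, a):
        """Hadamard on qubit a: X <-> Z, Y -> -Y."""
        for g in range(self.n):
            x, z = self.x[g], self.z[g]
            self.r[g] ^= x[a] & z[a]
            x[a], z[a] = z[a], x[a]

    def cnot(self, a, b):
        """Controlled-NOT with control a and target b."""
        for g in range(self.n):
            x, z = self.x[g], self.z[g]
            self.r[g] ^= x[a] & z[b] & (x[b] ^ z[a] ^ 1)
            x[b] ^= x[a]          # X on the control spreads to the target
            z[a] ^= z[b]          # Z on the target spreads to the control

    def generators(self):
        names = {(0, 0): "I", (1, 0): "X", (1, 1): "Y", (0, 1): "Z"}
        return [("-" if r else "+") + "".join(names[p] for p in zip(x, z))
                for x, z, r in zip(self.x, self.z, self.r)]

def ghz(n):
    """Prepare |0...0> + |1...1> with one H and a chain of controlled-NOTs."""
    s = Stabiliser(n)
    s.h(0)
    for q in range(n - 1):
        s.cnot(q, q + 1)
    return s

print(ghz(3).generators())
big = ghz(50)     # 50 generators of 101 bits each, versus 2**50 amplitudes
print(len(big.generators()))
```

The state produced by `ghz(n)` has all $n$ qubits entangled together, yet its stabiliser description grows only quadratically with $n$.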

Thus if $prop(S)$ denotes the property of a state that it does not have a polynomially sized
stabiliser description, then we can claim that $prop(S)$ is an essential resource for quantum
computational power (since absence of $prop(S)$ implies efficient classical simulability).

The concept of the stabiliser description of a state (compared to the amplitude descrip- 
tion) provides a hint of how conceptually diverse alternative formulations of quantum theory 
can be. Some recent work by Valiant [26] and Terhal and DiVincenzo [27], identifying another
interesting class of quantum computations that can be efficiently simulated, appears
to also fit into this framework, utilising a fermionic operator formalism as a mathematical 
description of quantum computational processes. 

Thus in a fundamental sense, the power of quantum computation over classical computa- 
tion ought to derive simultaneously from all possible classical mathematical formalisms for 
representing quantum theory, not any single such formalism and associated quality (such as 
entanglement) i.e. we have arrived at the enigmatic prospect of needing a representation of 
quantum physics that does not single out any particular choice of mathematical formalism. 